-
360 video streaming presents unique challenges in bandwidth efficiency and motion-to-photon (MTP) latency, particularly for live multi-user scenarios. While viewport prediction (VP) has emerged as the dominant solution, its effectiveness in live streaming is limited by training-data scarcity and the unpredictability of live content. We present 360LIVECAST, the first practical multicast framework for live 360 video that eliminates the need for VP through two key innovations: (1) a novel viewport hull representation that combines current viewports with marginal regions, enabling local frame synthesis while reducing bandwidth by 60% compared to full-panorama transmission, and (2) a viewport-specific hierarchical multicast framework that leverages edge computing to handle viewer dynamics while maintaining sub-25 ms MTP latency. Extensive evaluation using real-world network traces and viewing trajectories demonstrates that 360LIVECAST achieves 26.9% lower latency than VP-based approaches while maintaining superior scalability.
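The abstract does not spell out how the viewport hull is computed. As a minimal, purely illustrative sketch, one way to realize the idea is to select equirectangular tiles covering the current viewport plus a marginal region; the function name, tile grid, field of view, and margin below are all our assumptions, not 360LIVECAST's actual design:

```python
def viewport_hull_tiles(yaw_deg, pitch_deg, fov_deg=(100, 90),
                        margin_deg=15, grid=(8, 16)):
    """Select equirectangular tiles covering the viewport plus a margin.

    The margin is what would let a client synthesize nearby frames
    locally when the head moves between segments. Illustrative only.
    """
    rows, cols = grid
    h_half = fov_deg[0] / 2 + margin_deg   # horizontal half-extent
    v_half = fov_deg[1] / 2 + margin_deg   # vertical half-extent
    selected = set()
    for r in range(rows):
        for c in range(cols):
            # Tile center in degrees: yaw in [-180, 180), pitch in [-90, 90).
            t_yaw = (c + 0.5) / cols * 360 - 180
            t_pitch = (r + 0.5) / rows * 180 - 90
            d_yaw = (t_yaw - yaw_deg + 180) % 360 - 180  # wrap-around-safe
            if abs(d_yaw) <= h_half and abs(t_pitch - pitch_deg) <= v_half:
                selected.add((r, c))
    return selected

tiles = viewport_hull_tiles(yaw_deg=30, pitch_deg=0)
print(f"send {len(tiles)} of {8 * 16} tiles")  # a fraction of the panorama
```

Transmitting only the selected tiles, rather than the full panorama, is where a bandwidth saving of the reported magnitude would come from; the margin is what enables local frame synthesis before the next segment arrives.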
-
Multi-camera systems are essential in movies, live broadcasts, and other media. The selection of the appropriate camera at every moment has a decisive impact on production quality and audience preferences. Learning-based multi-camera view recommendation frameworks have been explored to assist professionals in decision making. This work explores how two standard cinematography practices can be incorporated into the learning pipeline: (1) not staying on the same camera for too long and (2) introducing a scene with a wide shot and gradually progressing to narrower ones. Accordingly, we incorporate (1) the duration for which the current camera has been displayed and (2) the camera identity, as temporal and camera embeddings in a transformer architecture, thereby implicitly guiding the model to learn the two practices from professionally labeled data. Experiments show that the proposed framework outperforms the baseline by 14.68% in six-way classification accuracy. Ablation studies on different approaches to embedding the temporal and camera information further verify the efficacy of the framework.
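As a hedged sketch of how the two embeddings might enter a transformer (the dimensions, layer counts, and all module and variable names below are illustrative assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn

class ViewRecommender(nn.Module):
    """Six-way camera selection with duration ("temporal") and
    camera-identity embeddings added to per-frame visual features."""

    def __init__(self, feat_dim=512, n_cams=6, max_dur=64):
        super().__init__()
        self.dur_emb = nn.Embedding(max_dur, feat_dim)  # how long on air
        self.cam_emb = nn.Embedding(n_cams, feat_dim)   # which camera
        layer = nn.TransformerEncoderLayer(feat_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(feat_dim, n_cams)         # next-camera logits

    def forward(self, feats, cam_ids, durations):
        # feats: (B, T, D) visual features; cam_ids, durations: (B, T) ints.
        x = feats + self.cam_emb(cam_ids) \
                  + self.dur_emb(durations.clamp(max=63))
        return self.head(self.encoder(x)[:, -1])  # pick camera for next step

model = ViewRecommender()
logits = model(torch.randn(2, 16, 512),           # visual features
               torch.randint(0, 6, (2, 16)),      # camera identities
               torch.randint(0, 64, (2, 16)))     # frames since last cut
print(logits.shape)  # torch.Size([2, 6])
```

The duration embedding gives the model an explicit signal for how long the current camera has been on air, which is exactly the cue the first cinematography practice depends on.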
-
We propose a two-stage estimation procedure for a copula-based model with semi-competing risks data, where the non-terminal event is subject to dependent censoring by the terminal event and both events are subject to independent censoring. With a copula-based model, the marginal survival functions of the individual event times are specified by semiparametric transformation models, and the dependence between the bivariate event times is specified by a parametric copula function. In the first stage, the parameters associated with the marginal distribution of the terminal event are estimated using only the corresponding observed outcomes; in the second stage, the marginal parameters for the non-terminal event time and the copula parameter are estimated together by maximizing a pseudo-likelihood function based on the joint distribution of the bivariate event times. We derive the asymptotic properties of the proposed estimator and provide an analytic variance estimator for inference. Through simulation studies, we show that our approach yields consistent estimates at lower computational cost and with greater robustness than the one-stage procedure of Chen (2012), in which all parameters are estimated simultaneously. In addition, our approach demonstrates more desirable finite-sample performance than the existing two-stage estimation method of Zhu et al. (2021). An R package, PMLE4SCR, implements the proposed method.
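In symbols (notation ours, following the abstract's description), the model and the two stages can be summarized as:

```latex
% Copula model for semi-competing risks: T_1 = non-terminal event time,
% T_2 = terminal event time, S_1 and S_2 their marginal survival functions
% (semiparametric transformation models), C_theta a parametric copula.
\[
  \Pr(T_1 > t_1,\; T_2 > t_2) \;=\; C_\theta\!\bigl(S_1(t_1),\, S_2(t_2)\bigr)
\]
% Stage 1: estimate the terminal-event marginal from its own outcomes only.
\[
  \hat{S}_2 \;=\; \arg\max_{S_2}\; \ell_2\bigl(S_2 \mid \text{terminal-event data}\bigr)
\]
% Stage 2: plug \hat{S}_2 into a pseudo-likelihood for the joint distribution
% and maximize over the non-terminal marginal and the copula parameter.
\[
  \bigl(\hat{S}_1, \hat{\theta}\bigr) \;=\; \arg\max_{S_1,\,\theta}\;
  \ell\bigl(S_1, \theta \mid \hat{S}_2,\ \text{bivariate data}\bigr)
\]
```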
-
Multi-camera systems are indispensable in movies, TV shows, and other media. Selecting the appropriate camera at every timestamp has a decisive impact on production quality and audience preferences. Learning-based view recommendation frameworks can assist professionals in decision-making. However, they often struggle outside of their training domains. The scarcity of labeled multi-camera view recommendation datasets exacerbates the issue. Based on the insight that many videos are edited from the original multi-camera videos, we propose transforming regular videos into pseudo-labeled multi-camera view recommendation datasets. Promisingly, by training the model on pseudo-labeled datasets stemming from videos in the target domain, we achieve a 68% relative improvement in the model’s accuracy in the target domain and bridge the accuracy gap between in-domain and never-before-seen domains.
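Under the stated insight, a pseudo-labeling pipeline could be as simple as pairing detected shots with the source camera each shot came from; the sketch below is our illustration, not the paper's pipeline, and the shot detector and camera-assignment step are placeholders for whatever tools one actually uses:

```python
from dataclasses import dataclass

@dataclass
class Shot:
    start: float    # seconds
    end: float
    camera_id: int  # pseudo-label: the view the editor chose

def pseudo_label(shot_boundaries, camera_assignments):
    """Turn an edited video into (clip, selected-camera) training pairs.

    shot_boundaries: [(start, end), ...] from any shot-cut detector;
    camera_assignments: which source camera each shot came from, e.g.
    obtained by clustering shots by visual similarity. Illustrative only.
    """
    return [Shot(s, e, cam)
            for (s, e), cam in zip(shot_boundaries, camera_assignments)]

shots = pseudo_label([(0.0, 4.2), (4.2, 9.8), (9.8, 12.0)], [0, 2, 0])
for sh in shots:
    print(f"{sh.start:5.1f}-{sh.end:5.1f}s -> camera {sh.camera_id}")
```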
-
Various goodness-of-fit tests are designed based on the so-called information matrix equivalence: if the assumed model is correctly specified, two information matrices derived from the likelihood function are equivalent. In the literature, this principle has been established for the likelihood function with fully observed data, but it has not been verified under the likelihood for censored data. In this manuscript, we prove the information matrix equivalence in the framework of semiparametric copula models for multivariate censored survival data. Based on this equivalence, we propose an information ratio (IR) test for the specification of the copula function. The IR statistic is constructed by comparing consistent estimates of the two information matrices. We derive the asymptotic distribution of the IR statistic and propose a parametric bootstrap procedure for finite-sample P-value calculation. The performance of the IR test is investigated via a simulation study and a real data example.
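Concretely, the equivalence states that, for a log-likelihood under a correctly specified model, the negative expected Hessian equals the expected outer product of the score. A ratio-type statistic built from their sample versions should then sit near its null value; the exact form below is our illustration of the idea, not necessarily the paper's definition:

```latex
% Information matrix equivalence at the true parameter \theta_0:
\[
  \mathcal{I}(\theta_0)
  \;=\; -\,\mathbb{E}\!\left[\frac{\partial^2 \ell}{\partial\theta\,\partial\theta^{\top}}\right]
  \;=\; \mathbb{E}\!\left[\frac{\partial \ell}{\partial\theta}\,
        \frac{\partial \ell}{\partial\theta^{\top}}\right]
\]
% A ratio-type comparison of the sample versions, with \hat{H} the
% negative-Hessian estimate, \hat{V} the outer-product estimate, and
% p = \dim(\theta); values far from 1 indicate misspecification:
\[
  \mathrm{IR} \;=\; \frac{1}{p}\,\operatorname{tr}\!\bigl(\hat{H}^{-1}\hat{V}\bigr)
\]
```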
-
Recent advances in computer vision algorithms and video streaming technologies have facilitated the development of edge-server-based video analytics systems, enabling them to process sophisticated real-world tasks such as traffic surveillance and workspace monitoring. Meanwhile, owing to their omnidirectional recording capability, 360-degree cameras have been proposed to replace traditional cameras in video analytics systems to offer enhanced situational awareness. Yet providing an efficient 360-degree video analytics framework is a non-trivial task: due to the higher resolution and geometric distortion of 360-degree videos, existing video analytics pipelines fail to meet the performance requirements for end-to-end latency and query accuracy. To address these challenges, we introduce the ST-360 framework, designed specifically for 360-degree video analytics. The framework features a spatial-temporal filtering algorithm that reduces both data transmission and computational workloads. Evaluation of ST-360 on a unique dataset of 360-degree first-responder videos shows that it yields accurate query results with a 50% reduction in end-to-end latency compared to state-of-the-art methods.
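The abstract does not detail the spatial-temporal filter. One plausible reading, sketched below with the thresholds, tile grid, and all names as our own assumptions, is to drop frames with little temporal change and, for the rest, forward only the spatial tiles where content actually changed:

```python
import numpy as np

def st_filter(prev_frame, frame, tile_grid=(6, 12), motion_thresh=8.0):
    """Hypothetical spatial-temporal filter for 360-degree analytics.

    Temporal step: skip the frame entirely if the panorama is static.
    Spatial step: otherwise keep only the tiles whose content changed,
    so only a fraction of each panorama reaches the detector.
    """
    diff = np.abs(frame.astype(np.float32) - prev_frame.astype(np.float32))
    if diff.mean() < motion_thresh / 4:
        return None  # temporal filter: nothing moved, send nothing
    rows, cols = tile_grid
    h, w = frame.shape[0] // rows, frame.shape[1] // cols
    return [(r, c) for r in range(rows) for c in range(cols)
            if diff[r*h:(r+1)*h, c*w:(c+1)*w].mean() > motion_thresh]

prev, cur = np.zeros((360, 720), np.uint8), np.zeros((360, 720), np.uint8)
cur[0:60, 0:60] = 200                 # synthetic motion in one corner
print(st_filter(prev, cur))           # -> [(0, 0)]: one tile to analyze
```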
-
The emergence of 360-video streaming systems has brought about new possibilities for immersive video experiences, while requiring significantly higher bandwidth than traditional 2D video streaming. Viewport prediction is used to address this problem, but interesting storylines outside the viewport are ignored. To address this limitation, we present SAVG360, a novel viewport guidance system that utilizes global content information available on the server side to enhance streaming with the best saliency-captured storyline of 360-videos. The saliency analysis is performed offline on the media server with a powerful GPU, and the saliency-aware guidance information is encoded and shared with clients through the Saliency-aware Guidance Descriptor. This enables the system to proactively guide users to switch between storylines of the video, while allowing users to follow or break guided storylines through a novel user interface. Additionally, we present a viewing-mode prediction algorithm to enhance video delivery in SAVG360. Evaluation on user viewport traces in 360-videos demonstrates that SAVG360 outperforms existing tiled streaming solutions in overall viewport prediction accuracy and in the ability to stream high-quality 360-videos under bandwidth constraints. Furthermore, a user study highlights the advantages of our proactive guidance approach over simply predicting and streaming where users look.
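The wire format of the Saliency-aware Guidance Descriptor is not given in the abstract. As a loose, hypothetical sketch of the client-side logic (all field and function names are ours), a per-segment descriptor could carry ranked storyline directions that the player snaps to when the user opts to follow guidance:

```python
from dataclasses import dataclass, field

@dataclass
class GuidanceDescriptor:
    """Hypothetical per-segment guidance record, compiled offline on the
    server and shipped alongside the media segments."""
    segment: int
    storylines: list = field(default_factory=list)  # ranked (yaw, pitch, score)

def guide(desc, user_yaw, user_pitch, follow=True, snap_deg=30):
    """Return the viewport to render/prefetch: the top-ranked storyline
    when the user follows guidance and it is close enough to feel like a
    nudge, else the user's own gaze (i.e., the user broke the storyline)."""
    if follow and desc.storylines:
        yaw, pitch, _score = desc.storylines[0]          # best storyline
        if abs((yaw - user_yaw + 180) % 360 - 180) <= snap_deg:
            return yaw, pitch
    return user_yaw, user_pitch

d = GuidanceDescriptor(segment=7, storylines=[(40.0, -5.0, 0.93)])
print(guide(d, user_yaw=25.0, user_pitch=0.0))  # -> (40.0, -5.0)
```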
